Twitter on Drugs: Pharmaceutical Spam in Tweets

Authors

  • Chandra Shekar
  • Kathy J. Liszka
  • Chien-Chung Chan
Abstract

Twitter presents a new forum for spammers to facilitate illegal pharmaceutical scams. We present a classification scheme that uses a decision strategy and data mining techniques and takes into account the unbalanced nature of the data set. Four classifiers are used to identify pharmaceutical spam tweets. Classifiers J48 and Random Tree (RT) are generated by Weka tools, and classifiers DL(J48) and DL(RT) are based on the combination of J48 and RT with the decision matrix. The classifiers were tested using manually labeled data sets collected at different time spans. Experimental results suggest that the combination of RT with the decision matrix provides a stable performance improvement over using standalone tree-based classifiers.

Keywords: data mining; text mining; spam; pharmaceuticals; social networking; Twitter

Similar resources

Detecting Pharmaceutical Spam in Microblog Messages

Microblogs are one of a growing group of social network tools. Twitter is, at present, one of the most popular forums for microblogging in online social networks, and the fastest growing. Fifty million messages flow through servers, computers, and cell phones on a wide variety of topics exchanged daily. With this considerable volume, Twitter is a natural and obvious target for spreading spam vi...

An analysis of 14 Million tweets on hashtag-oriented spamming

Over the years, Twitter has become a popular platform for information dissemination and information gathering. However, the popularity of Twitter has attracted not only legitimate users but also spammers who exploit social graphs, popular keywords, and hashtags for malicious purposes. In this paper, we present a detailed analysis of the HSpam14 dataset, which contains 14 million tweets with spa...

A Survey of Spam Detection Methods on Twitter

Twitter is one of the most popular social media platforms, with 313 million monthly active users who post 500 million tweets per day. This popularity attracts the attention of spammers who use Twitter for malicious aims such as phishing legitimate users or spreading malicious software and advertisements through URLs shared within tweets, aggressively following/unfollowing legitimate users and ...

Spammers Are Becoming "Smarter" on Twitter

Twitter has become one of the most commonly used communication tools in daily life. With 500 million users, Twitter now generates more than 500 million tweets per day. However, its popularity has also attracted spamming. Spammers spread many intensive tweets, which can lure legitimate users to commercial or malicious sites containing malware downloads, phishing, drug sales, scams, and more.1 S...

Twitter Content-Based Spam Filtering

Twitter has become one of the most used social networks and, like every popular medium, it is prone to misuse. In this context, spam on Twitter has emerged in recent years, becoming an important problem for users. Several approaches have recently appeared that can determine whether a user is a spammer. However, these blacklisting systems cannot filter ...

Publication date: 2011